-
Notifications
You must be signed in to change notification settings - Fork 231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
compiler: Add guards to prevent OOB when streaming buffers with ConditionalDimension #2150
compiler: Add guards to prevent OOB when streaming buffers with ConditionalDimension #2150
Conversation
for i in range(u.save): | ||
u.data[i, :] = i | ||
|
||
expr = u.dt2 + 3.*u.dt(x0=time_sub - time_sub.spacing) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI this is just u.dtl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good to know - I adapted the code from the next test there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, is there some standard cond dim cases that were broken as well non related to your issue?
@@ -589,7 +589,7 @@ def test_conddim_w_shifting(): | |||
op1 = Operator(eqns, opt='buffering') | |||
|
|||
# Check generated code | |||
assert len(retrieve_iteration_tree(op1)) == 3 | |||
assert len(retrieve_iteration_tree(op1)) == 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what changed in the code since it seems to change existing tests not related to init_to_host
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests also generate initialisation code that is affected by the addition of the guard. The loop:
if (time_m <= time_M)
{
/* Begin section0 */
START_TIMER(section0)
for (int db0 = 0; db0 <= 0; db0 += 1)
{
for (int x = x_m; x <= x_M; x += 1)
{
for (int y = y_m; y <= y_M; y += 1)
{
ub0[(db0 + (time_M / factor))%(1)][x][y] = u[db0 + (time_M / factor)][x][y];
}
}
}
STOP_TIMER(section0,timers)
/* End section0 */
}
becomes:
if (time_m <= time_M)
{
/* Begin section0 */
START_TIMER(section0)
for (int db0 = 0; db0 <= 0; db0 += 1)
{
if (0 <= db0 + (time_M / factor))
{
for (int x = x_m; x <= x_M; x += 1)
{
for (int y = y_m; y <= y_M; y += 1)
{
ub0[(db0 + (time_M / factor))%(1)][x][y] = u[db0 + (time_M / factor)][x][y];
}
}
}
}
STOP_TIMER(section0,timers)
/* End section0 */
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is that what causes the bump from 3 to 4 iteration nests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the iteration nest before adding the guard:
IterationTree(<WithProperties[affine,parallel]::Iteration db0; (0, 0, 1)>, <WithProperties[affine,parallel]::Iteration x; (x_m, x_M, 1)>, <WithProperties[affine,parallel]::Iteration y; (y_m, y_M, 1)>)
IterationTree(<WithProperties[affine,sequential]::Iteration time[t0,t1]; (time_m, time_M, 1)>,)
IterationTree(<WithProperties[affine,sequential]::Iteration time[t0,t1]; (time_m, time_M, 1)>, <WithProperties[affine,parallel,parallel=]::Iteration x; (x_m, x_M, 1)>, <WithProperties[affine,parallel,parallel=]::Iteration y; (y_m, y_M, 1)>)
and after adding the guard:
IterationTree(<WithProperties[affine,parallel]::Iteration db0; (0, 0, 1)>,)
IterationTree(<WithProperties[affine,parallel]::Iteration db0; (0, 0, 1)>, <WithProperties[affine,parallel]::Iteration x; (x_m, x_M, 1)>, <WithProperties[affine,parallel]::Iteration y; (y_m, y_M, 1)>)
IterationTree(<WithProperties[affine,sequential]::Iteration time[t0,t1]; (time_m, time_M, 1)>,)
IterationTree(<WithProperties[affine,sequential]::Iteration time[t0,t1]; (time_m, time_M, 1)>, <WithProperties[affine,parallel,parallel=]::Iteration x; (x_m, x_M, 1)>, <WithProperties[affine,parallel,parallel=]::Iteration y; (y_m, y_M, 1)>)
Iteration tree IterationTree(<WithProperties[affine,parallel]::Iteration db0; (0, 0, 1)>,)
appears as a result. Not sure if this is expected or if you want me to probe into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it's OK!
@mloubout Issue popped up with personal code that does something similar to this, so I haven't seen anything else that broke |
Codecov Report
@@ Coverage Diff @@
## master #2150 +/- ##
==========================================
- Coverage 87.10% 87.07% -0.04%
==========================================
Files 223 223
Lines 39893 39916 +23
Branches 5172 5173 +1
==========================================
+ Hits 34750 34758 +8
- Misses 4567 4580 +13
- Partials 576 578 +2
|
devito/passes/clusters/buffering.py
Outdated
@@ -177,6 +179,9 @@ def callback(self, clusters, prefix, cache=None): | |||
lhs = b.indexed[dims]._subs(dim, b.firstidx.b) | |||
rhs = b.function[dims]._subs(dim, b.firstidx.f) | |||
|
|||
if dim.is_Conditional: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about we place the guard regardless of whether it's a ConditionalDimension or a plain Dimension?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
second question.
Since db0
starts, by construction, at 0 and can only take positive values, how about this as a guard instead:
guards[pd] = GuardBound(0, b.firstidx.f - b.xd) # aka: time_M / factor >= 0
then later on, you would have to use andg
to chain this guard with the other time_M >= time_m
guard
this would generalise to the no-ConditionalDimension case as well
this way, we would generate neater code in my opinion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be missing something, but I think the guard does need to be on db0
- we need to check that db0 + (time_m / factor) - 1 >= 0
or db0 + (time_M / factor) - 1 >= 0
depending on whether it's forward or backward.
Using guards[pd] = GuardBound(0, b.firstidx.f - b.xd) # aka: time_M / factor >= 0
tests fail as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're right I just got confused
However, I still think this generalisation is feasible:
what about we place the guard regardless of whether it's a ConditionalDimension or a plain Dimension?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that should not be a problem. It will make more tests trip on the number of iteration trees, but happy to make the changes
devito/passes/clusters/buffering.py
Outdated
@@ -226,8 +231,13 @@ def callback(self, clusters, prefix, cache=None): | |||
ispace = b.readfrom | |||
properties = c.properties.sequentialize(d) | |||
|
|||
guards = c.guards |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpicking: guards are typically computed before properties
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all these changes. Appreciate that. Approved!
Rebase and GTG |
5903ad1
to
746afeb
Compare
Many thanks! Merging... |
Using the time derivative of a streamed buffer with a ConditionalDimension results in OOB within
init_to_host
. This was not captured in existing tests.A new test has been added
test_gpu_common.py::test_streaming_multi_input_conddim_foward
that fails with this OOB; and a new guard is now introduced inBuffering
to prevent it.